Dataset for the First Evaluation on Chinese Machine Reading Comprehension

Authors

  • Yiming Cui
  • Ting Liu
  • Zhipeng Chen
  • Wentao Ma
  • Shijin Wang
  • Guoping Hu
Abstract

Machine Reading Comprehension (MRC) has become enormously popular recently and has attracted a lot of attention. However, existing reading comprehension datasets are mostly in English. To add diversity to reading comprehension datasets, in this paper we propose a new Chinese reading comprehension dataset to accelerate related research in the community. The proposed dataset contains two different types: cloze-style reading comprehension and user query reading comprehension, associated with large-scale training data as well as human-annotated validation and hidden test sets. Along with this dataset, we also hosted the first Evaluation on Chinese Machine Reading Comprehension (CMRC-2017) and successfully attracted tens of participants, which suggests the potential impact of this dataset.

Similar resources

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

In this paper, we introduce DuReader, a new large-scale, open-domain Chinese machine reading comprehension (MRC) dataset, aiming to tackle real-world MRC problems. In comparison to prior datasets, DuReader has the following characteristics: (a) the questions and the documents are all extracted from real application data, and the answers are human generated; (b) it provides rich annotations for ...

RACE: Large-scale ReAding Comprehension Dataset From Examinations

We present RACE, a new dataset for benchmark evaluation of methods in the reading comprehension task. Collected from the English exams for middle and high school Chinese students in the age range between 12 to 18, RACE consists of near 28,000 passages and near 100,000 questions generated by human experts (English instructors), and covers a variety of topics which are carefully designed for eval...

Consensus Attention-based Neural Networks for Chinese Reading Comprehension

Reading comprehension has embraced a booming in recent NLP research. Several institutes have released the Cloze-style reading comprehension data, and these have greatly accelerated the research of machine comprehension. In this work, we firstly present Chinese reading comprehension datasets, which consist of People Daily news dataset and Children’s Fairy Tale (CFT) dataset. Also, we propose a c...

The Effects of Concept Mapping and Critical Thinking Test Strategies on EFL Iranian Learners' Reading Comprehension

This research was designed to investigate the effects of critical thinking and concept mapping test strategies on EFL Iranian learners' reading comprehension in five sessions during near two months. Fifty eight male and female students took a pre-test as a language proficiency test first, then they were randomly assigned to two experimental groups and a control group, twenty students for critical t...

The Effects of Gender Differences and Schema-Based Pre-reading Activities on Reading Comprehension Skill

This study was designed to investigate the effects of gender and Schema-based pre-reading activities on the Iranian EFL learners’ reading comprehension. The sample consisted of 60 male and female students studying at second-grade high school in Abhar city. Two reading passages (“Charles Dickens and the Little Children”, and “Hic, Hic, Hic”) were randomly selected from second-grade English textb...

Journal:
  • CoRR

Volume abs/1709.08299

Publication date: 2017